Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DeviceSanitizer] Checking "sycl::free" related errors #1402

Merged
merged 54 commits into from
Apr 17, 2024

Conversation

AllanZyne
Copy link
Contributor

@AllanZyne AllanZyne commented Mar 1, 2024

LLVM: intel/llvm#12882

See details in LLVM PR.

kbenzie and others added 18 commits February 2, 2024 08:46
Silence warnings about using outdated version of Node.js due to using
actions/[email protected].
If an OpenCL driver returns `CL_PROGRAM_BINARY_TYPE_INTERMEDIATE`
(defined in the `cl_khr_spir` extension to enable SPIR 1.2 not SPIR-V)
when querying `CL_PROGRAM_BINARY_TYPE` we should handle this gracefully.
This patch maps the `CL_PROGRAM_BINARY_TYPE_INTERMEDIATE` to
`UR_PROGRAM_BINARY_TYPE_NONE`. From the users perspective these are
semantically equivalent as they both require compilation before the
program can be used to create a kernel.
Replace uses of `sizeof(const char*)` with `std::strlen(const char*)` in
CUDA adapter specific kernel tests.
These changes to the *Adapter Change Process* aim to bring the process
up to date with current practice by describing how the *ready to merge*
label is being used and also to move the undrafting of *intel/llvm* pull
requests earlier in the process to garner approvals from code-reviewers
sooner to alleviate a bottleneck in the merge pipeline.
@AllanZyne AllanZyne changed the title [SanitizerLayer] Checking "sycl::free" related errors [DeviceSanitizer] Checking "sycl::free" related errors Mar 1, 2024
source/loader/layers/sanitizer/linux/backtrace.cpp Outdated Show resolved Hide resolved
source/loader/layers/sanitizer/stacktrace.cpp Outdated Show resolved Hide resolved
source/loader/layers/sanitizer/stacktrace.cpp Outdated Show resolved Hide resolved
source/loader/layers/sanitizer/symbolizer_llvm.cpp Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented Mar 4, 2024

Codecov Report

Attention: Patch coverage is 3.33333% with 580 lines in your changes are missing coverage. Please review.

Project coverage is 12.44%. Comparing base (78ef1ca) to head (3c55cac).
Report is 142 commits behind head on main.

Files Patch % Lines
...ource/loader/layers/sanitizer/asan_interceptor.cpp 3.10% 187 Missing ⚠️
source/loader/layers/sanitizer/asan_report.cpp 0.00% 90 Missing ⚠️
...rce/loader/layers/sanitizer/ur_sanitizer_utils.cpp 0.00% 65 Missing ⚠️
...urce/loader/layers/sanitizer/asan_shadow_setup.cpp 26.92% 38 Missing ⚠️
source/loader/layers/sanitizer/ur_sanddi.cpp 0.00% 36 Missing ⚠️
...ource/loader/layers/sanitizer/asan_interceptor.hpp 0.00% 28 Missing ⚠️
source/loader/layers/sanitizer/symbolizer_llvm.cpp 0.00% 27 Missing ⚠️
source/loader/layers/sanitizer/stacktrace.cpp 0.00% 26 Missing ⚠️
source/loader/layers/sanitizer/linux/backtrace.cpp 0.00% 20 Missing ⚠️
source/loader/layers/sanitizer/asan_quarantine.hpp 0.00% 15 Missing ⚠️
... and 7 more

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1402      +/-   ##
==========================================
- Coverage   14.82%   12.44%   -2.38%     
==========================================
  Files         250      251       +1     
  Lines       36220    36265      +45     
  Branches     4094     4117      +23     
==========================================
- Hits         5369     4514     -855     
- Misses      30800    31747     +947     
+ Partials       51        4      -47     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pbalcer pbalcer added the ready to merge Added to PR's which are ready to merge label Apr 5, 2024
@AllanZyne AllanZyne requested review from a team as code owners April 7, 2024 01:53
@AllanZyne
Copy link
Contributor Author

Hi @pbalcer.
I know you are busy with your release plan, but can you help to plan to merge this PR more earlier?
Because this PR modified the code layout a lot, so it blocked our future patches.
Our future patches will try to be small and independent to each other.
Thank you!

@kbenzie
Copy link
Contributor

kbenzie commented Apr 8, 2024

@oneapi-src/unified-runtime-level-zero-write @oneapi-src/unified-runtime-command-buffer-write please review

@AllanZyne
Copy link
Contributor Author

@oneapi-src/unified-runtime-level-zero-write @oneapi-src/unified-runtime-command-buffer-write please review

Thanks!

@EwanC
Copy link
Contributor

EwanC commented Apr 9, 2024

@oneapi-src/unified-runtime-level-zero-write @oneapi-src/unified-runtime-command-buffer-write please review

This code looks unrelated to command-buffers, could somebody explain why @oneapi-src/unified-runtime-command-buffer-write has been tagged to review it if it is related.

@AllanZyne
Copy link
Contributor Author

@oneapi-src/unified-runtime-level-zero-write @oneapi-src/unified-runtime-command-buffer-write please review

This code looks unrelated to command-buffers, could somebody explain why @oneapi-src/unified-runtime-command-buffer-write has been tagged to review it if it is related.

I guess this is because the first few commits.
This PR is unrelated to @oneapi-src/unified-runtime-level-zero-write as well.
Should I remove those commits that were introduced unintentionally?

@pbalcer
Copy link
Contributor

pbalcer commented Apr 9, 2024

@oneapi-src/unified-runtime-level-zero-write @oneapi-src/unified-runtime-command-buffer-write please review

This code looks unrelated to command-buffers, could somebody explain why @oneapi-src/unified-runtime-command-buffer-write has been tagged to review it if it is related.

I guess this is because the first few commits. This PR is unrelated to @oneapi-src/unified-runtime-level-zero-write as well. Should I remove those commits that were introduced unintentionally?

No, we can just squash this during merge. I believe all necessary reviews already happened, so this should be ready to merge.

@kbenzie
Copy link
Contributor

kbenzie commented Apr 10, 2024

@AllanZyne there's still a pending approval on intel/llvm#12882 from the @intel/dpcpp-tools-reviewers team so I pinged them for a review. Once that approval is in place we can merge this.

@kbenzie kbenzie added loader Loader related feature/bug sanitizer Sanitizer layer issues/changes/specification labels Apr 10, 2024
@kbenzie
Copy link
Contributor

kbenzie commented Apr 17, 2024

@AllanZyne I see the approval came in, we'll merge this next once intel/llvm#12447 goes in.

@xtian-github
Copy link

@kbenzie did you get all necessary approval for merging this PR? We need to merge it asap for Aurora project deliverable. Thanks.

@kbenzie
Copy link
Contributor

kbenzie commented Apr 17, 2024

@kbenzie did you get all necessary approval for merging this PR? We need to merge it asap for Aurora project deliverable. Thanks.

Everything is there. As I mentioned above:

@AllanZyne I see the approval came in, we'll merge this next once intel/llvm#12447 goes in.

That's not merge yet but we're working on it.

@xtian-github
Copy link

@kbenzie did you get all necessary approval for merging this PR? We need to merge it asap for Aurora project deliverable. Thanks.

Everything is there. As I mentioned above:

@AllanZyne I see the approval came in, we'll merge this next once intel/llvm#12447 goes in.

That's not merge yet but we're working on it.

THANKS!

@aarongreig aarongreig merged commit 003d4da into oneapi-src:main Apr 17, 2024
52 checks passed
@AllanZyne
Copy link
Contributor Author

Thanks for your help, @xtian-github!

aarongreig added a commit to aarongreig/unified-runtime that referenced this pull request Apr 18, 2024
…free"

This reverts commit 003d4da, reversing
changes made to e38e79e.
steffenlarsen pushed a commit to intel/llvm that referenced this pull request Apr 19, 2024
UR: oneapi-src/unified-runtime#1402

This PR added supports for checking the following types of error in
"UR_LAYER_ASAN":

- bad-free: the memory address to be freed is not allocated by UR
- bad-context: the memory address to be freed uses a wrong "context"
- double-free: the memory address to be freed is already freed 
- use-after-free: the freed memory is used in kernel

I added the environment variable "UR_LAYER_ASAN_OPTIONS" to have
additional control over "UR_LAYER_ASAN", which is similar to
"ASAN_OPTIONS" in
[ASan](https://github.com/google/sanitizers/wiki/AddressSanitizerFlags).
Currently, it supports:

- "quarantine_size_mb" (default = 0)
- Size (in MB) of quarantine per device. The pointers passed to
urUSMFree are not freed immediately, but saved into QuarantineCache (per
device cache), and when the cached chunk size (only counts the size of
USM buffer, not shadow memory) is more than "quarantine_size_mb", the
first enqueued chunk will be freed (aka., FIFO). Lower value may reduce
memory usage but increase the chance of false negatives
- This option must be enabled for checking "double-free" and
"use-after-free"
- "debug" (default = 0)
- Print extra debug messages in kernel ("__AsanDebug” in
“libdevice/sanitizer_utils.cpp”), which is helpful for DeviceSanitizer
developers.

For example, to enable "use-after-free" with 5MB quarantine cache and
debug message in kernel, you need to
```bash
UR_LAYER_ASAN_OPTIONS="quarantine_size_mb:5;debug:1" ./sycl_app
```

---------

Co-authored-by: Maosu Zhao <[email protected]>
Co-authored-by: Aaron Greig <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
loader Loader related feature/bug ready to merge Added to PR's which are ready to merge sanitizer Sanitizer layer issues/changes/specification
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants